Supervised Word Sense Disambiguation with Support Vector Machines and multiple knowledge sources
نویسندگان
چکیده
We participated in the SENSEVAL-3 English lexical sample task and multilingual lexical sample task. We adopted a supervised learning approach with Support Vector Machines, using only the official training data provided. No other external resources were used. The knowledge sources used were partof-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. For the translation and sense subtask of the multilingual lexical sample task, the English sense given for the target word was also used as an additional knowledge source. For the English lexical sample task, we obtained fine-grained and coarse-grained score (for both recall and precision) of 0.724 and 0.788 respectively. For the multilingual lexical sample task, we obtained recall (and precision) of 0.634 for the translation subtask, and 0.673 for the translation and sense subtask.
منابع مشابه
An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation
In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include Support Vector Machines (SVM), Naive Bay...
متن کاملKernel Methods for Word Sense Disambiguation and Acronym Expansion
The scarcity of manually labeled data for supervised machine learning methods presents a significant limitation on their ability to acquire knowledge. The use of kernels in Support Vector Machines (SVMs) provides an excellent mechanism to introduce prior knowledge into the SVM learners, such as by using unlabeled text or existing ontologies as additional knowledge sources. Our aim is to develop...
متن کاملUSP-IBM-1 and USP-IBM-2: The ILP-based Systems for Lexical Sample WSD in SemEval-2007
We describe two systems participating of the English Lexical Sample task in SemEval2007. The systems make use of Inductive Logic Programming for supervised learning in two different ways: (a) to build Word Sense Disambiguation (WSD) models from a rich set of background knowledge sources; and (b) to build interesting features from the same knowledge sources, which are then used by a standard mod...
متن کاملIt Makes Sense: A Wide-Coverage Word Sense Disambiguation System for Free Text
In this paper, we present IMS, a supervised English all-words WSD system. The flexible framework of IMS allows users to integrate different preprocessing tools, additional features, and different classifiers. By default, we use linear support vector machines as the classifier with multiple knowledge-based features. In our implementation, IMS achieves state-of-the-art results on several SensEval...
متن کاملThe WSD Development Environment
In this paper we present the Word Sense Disambiguation Development Environment (WSDDE), a platform for testing various Word Sense Disambiguation (WSD) technologies, as well as the results of first experiments in applying the platform to WSD in Polish. The current development version of the environment facilitates the construction and evaluation of WSD methods in the supervised Machine Learning ...
متن کامل